Spatial scalable compression scheme using spatial sharpness enhancement techniques

ABSTRACT

A video encoder/decoder with spatial scalable compression schemes using spatial sharpness enhancement techniques is disclosed. The video compression scheme introduces a number of video enhancement techniques on the base layer. A picture analyzer is used to determine the best technique, or the best mix, of the various video enhancement techniques. The picture analyzer compares the selected mix of video enhancement techniques with the original full resolution input signal to determine for which of the pixels or groups of pixels a residual enhancement layer is required. Parameters defining the selected mix of video enhancement techniques are transmitted to the decoder layer so the same mix of video enhancement techniques can be used in the decoder layer.

FIELD OF THE INVENTION

The invention relates to a video encoder/decoder, and more particularly to a video encoder/decoder with spatial scalable compression schemes using spatial sharpness enhancement techniques.

BACKGROUND OF THE INVENTION

Because of the massive amounts of data inherent in digital video, the transmission of full-motion, high-definition digital video signals is a significant problem in the development of high-definition television. More particularly, each digital image frame is a still image formed from an array of pixels according to the display resolution of a particular system. As a result, the amounts of raw digital information included in high-resolution video sequences are massive. In order to reduce the amount of data that must be sent, compression schemes are used to compress the data. Various video compression standards or processes have been established, including MPEG-2, MPEG-4, and H.263.

Many applications are enabled where video is available at various resolutions and/or qualities in one stream. Methods to accomplish this are loosely referred to as scalability techniques. There are three axes on which one can deploy scalability. The first is scalability on the time axis, often referred to as temporal scalability. Secondly, there is scalability on the quality axis, often referred to as signal-to-noise scalability or fine-grain scalability. The third axis is the resolution axis (number of pixels in the image), often referred to as spatial scalability or layered coding. In layered coding, the bitstream is divided into two or more bitstreams, or layers. The layers can be combined to form a single high quality signal. For example, the base layer may provide a lower quality video signal, while the enhancement layer provides additional information that can enhance the base layer image.

In particular, spatial scalability can provide compatibility between different video standards or decoder capabilities. With spatial scalability, the base layer video may have a lower resolution than the input video sequence, in which case the enhancement layer carries information which can restore the resolution of the base layer to the input sequence level.

FIG. 1 illustrates a known layered video encoder 100. The depicted encoding system 100 accomplishes layer compression, whereby a portion of the channel is used for providing a low resolution base layer and the remaining portion is used for transmitting edge enhancement information, whereby the two signals may be recombined to bring the system up to high-resolution. The high resolution video input is split by splitter 102 whereby the data is sent to a low pass filter 104 and a subtraction circuit 106. The low pass filter 104 reduces the resolution of the video data, which is then fed to a base encoder 108. In general, low pass filters and encoders are well known in the art and are not described in detail herein for purposes of simplicity. The encoder 108 produces a lower resolution base stream which is provided to a second splitter 110 from where it is output from the system 100. The base stream can be broadcast, received and via a decoder, displayed as is, although the base stream does not provide a resolution which would be considered as high-definition.

The other output of the splitter 110 is fed to a decoder 112 within the system 100. From there, the decoded signal is fed into an interpolate and upsample circuit 114. In general, the interpolate and upsample circuit 114 reconstructs the filtered out resolution from the decoded video stream and provides a video data stream having the same resolution as the high-resolution input. However, because of the filtering and the losses resulting from the encoding and decoding, certain errors are present in the reconstructed stream. These errors are determined in the subtraction circuit 106 by subtracting the reconstructed high-resolution stream from the original, unmodified high-resolution stream. The output of the subtraction circuit 106 is fed to an enhancement encoder 116 which outputs a reasonable quality enhancement stream.
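By way of illustration only, the layering of FIG. 1 can be sketched on a single row of pixels. The sketch is not part of the known encoder 100: the function names, the pair-averaging low pass filter and the pixel-repeating upsampler are assumptions chosen for brevity, and the base encoder 108 and decoder 112 are assumed to be lossless so that only the filtering loss appears in the residual.

```python
def downscale(row):
    """Assumed low pass filter plus decimation by 2: average each pixel pair."""
    return [(row[i] + row[i + 1]) / 2 for i in range(0, len(row) - 1, 2)]


def upscale(row):
    """Assumed stand-in for the interpolate and upsample circuit: repeat pixels."""
    out = []
    for pixel in row:
        out.extend([pixel, pixel])
    return out


def encode_layers(original):
    """Split a full resolution row into a base layer and a residual layer."""
    base = downscale(original)          # low resolution base layer
    reconstructed = upscale(base)       # what a base-only decoder could rebuild
    residual = [o - r for o, r in zip(original, reconstructed)]
    return base, residual


def decode_layers(base, residual):
    """Recombine the two layers into a full resolution row."""
    return [r + e for r, e in zip(upscale(base), residual)]


# Hypothetical row of eight pixel values (even length is assumed).
original = [10, 12, 50, 90, 90, 88, 20, 20]
base, residual = encode_layers(original)
restored = decode_layers(base, residual)
```

In this lossless sketch the residual carries exactly the detail that the filtering removed, so recombining both layers restores the input row.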

The disadvantage of filtering and downscaling the input video to a lower resolution and then compressing it is that the video loses sharpness. This can to a certain degree be compensated for by using sharpness enhancement after the decoder. Although this can be made to work reasonably well for most parts of the video picture, there are some areas within the picture where the result remains poor compared to the original picture, e.g., small text parts will remain unreadable even with the most sophisticated enhancement.

SUMMARY OF THE INVENTION

The invention overcomes the deficiencies of other known layered compression schemes by increasing the video compression of a scalable compression scheme by the introduction of a number of video enhancement techniques on the base layer. Using a video picture analyzer, the best mix of the various video enhancement techniques is determined and parameters defining this mix are transmitted to the decoder section as user data. The video picture analyzer compares the selected mix of enhanced bitstreams with the original full resolution input signal and determines for which pixels a residual enhancement layer is required.

According to one embodiment of the invention, a method and apparatus for encoding and decoding an input video bitstream is disclosed. A base bitstream and a residual bitstream are encoded in the following manner. A decoded upscaled base bitstream is enhanced in a first plurality of enhancement units having different enhancement algorithms and a plurality of enhanced base video streams are outputted. The input video bitstream is compared with the decoded upscaled base bitstream and the enhanced base video streams, where the output of the picture analyzer controls the information contained in the residual bitstream. The base bitstream and the residual bitstream are decoded in the following manner. The same enhancement is performed on the decoded base bitstream as was performed in the encoder unit. The decoded residual bitstream is superimposed on the decoded and enhanced base video stream to produce a video output bitstream.

According to another embodiment of the invention, a mix of the enhanced base video streams and the decoded upscaled base bitstream can be used to control the information in the encoded residual bitstream, i.e., which pixels or groups of pixels should be included in the decoded residual bitstream.

These and other aspects of the invention will be apparent from and elucidated with reference to the embodiments described hereafter.

BRIEF DESCRIPTION OF THE DRAWINGS

The invention will now be described, by way of example, with reference to the accompanying drawings, wherein:

FIG. 1 is a block diagram representing a known layered video encoder;

FIG. 2 is a block diagram of a layered video encoder/decoder according to one embodiment of the invention;

FIG. 3 is a block diagram of a layered video encoder/decoder according to another embodiment of the invention; and

FIG. 4 is an illustration of a vector candidate set location according to one embodiment of the invention.

DETAILED DESCRIPTION OF THE INVENTION

According to one embodiment of the invention, a spatial scalable compression scheme using spatial sharpness enhancement techniques is disclosed. Briefly, the filtered and downscaled video sequence is compressed. Then, out of the decoded base layer frames, several upscaled versions are processed using a variety of enhancement algorithms. These can include a standard upscaled and filtered (for example, Nyquist filtered) version as well as various sharpness enhancement algorithm versions. A picture analyzer processes all of the information and selects the best or the best mix of these versions. The mix parameters which define the selected mix are also inserted in the encoded residual bitstream, as user data, so that the decoder can exactly reproduce this enhancement.

However, in some areas of the sharpness enhanced frames, the results will remain inadequate. By comparing in the encoder the available original full resolution frames with the enhancement frames, these areas can be detected. Only these detected areas will be compressed and be part of the residual bitstream which is inputted into the enhancement layer. The decoder then decodes the base layer downscaled bitstream and applies the same enhancement parameters to the decoded output as were applied in the encoder. The decoder then decodes the residual bitstream and superimposes the decoded bitstream on the pixels of the already decoded and enhanced base layer frames.

This embodiment will now be described in more detail with reference to FIG. 2 which is a block diagram of an encoder/decoder which can be used with the invention. The depicted encoding/decoding system 200 accomplishes layer compression, whereby a portion of the channel is used for providing a low resolution base layer and the remaining portion is used for transmitting edge enhancement information, whereby the two signals may be recombined to bring the system up to high-resolution. The high resolution video input 201 is split by a splitter 210 whereby the data is sent to a low pass filter 212, for example a Nyquist filter, and a splitter 232. The low pass filter 212 reduces the resolution of the video data, which is then fed to a base encoder 214. In general, low pass filters and encoders are well known in the art and are not described in detail herein for purposes of simplicity. The base encoder 214 produces a lower resolution base stream 215. The base stream can be broadcast, received and via a decoder, displayed as is, although the base stream does not provide a resolution which would be considered as high-definition.

The encoder also outputs a decoded base stream to an upscaling circuit 216. In addition, a decoder (not illustrated) can be inserted into the circuit after the encoder 214 to decode the output of the encoder prior to being sent to the upscaling circuit 216. In general, the upscaling circuit 216 reconstructs the filtered out resolution from the decoded video stream and provides a video data stream having the same resolution as the high-resolution input. The upscaled bitstream v1 from the upscaling circuit 216 is split by a splitter 218 and inputted into a picture analyzer 230, a subtraction circuit 234 and a splitter 220. The upscaled bitstream v1 from splitter 220 is inputted into enhancement units 222 and 224. Each enhancement unit operates a different spatial enhancement algorithm which will be explained in more detail below. FIG. 2 has two enhancement units but it will be understood that any number of enhancement units can be used in the invention.

Many video enhancement techniques exist and they all modify the picture content such that the appreciation of the resulting picture is improved. The subjective character of these enhancements complicates the optimization process and is likely the reason for the diversity in video enhancement algorithms. Various enhancement algorithms contribute by some means to the picture quality, and as a result, they often appear in a chain to profit from the individual strengths. Noise reduction and sharpness improvement algorithms are just a few examples out of a large set of enhancement algorithms. It will be understood that any of these known enhancement algorithms can be used in the invention.

A high-quality spatial enhancement function consists of a collection of algorithms that contribute to different aspects of sharpness. Some algorithms only improve the gradients in the picture by increasing their steepness, whereas others modify the amplitude of the gradients. It may seem that these algorithms are mutually exclusive; however, this is far from true. Both means to improve the gradient characteristics may be used, where a predefined model determines the individual contribution of each algorithm.

Returning to FIG. 2, the upscaled bitstream v1 is processed in enhancement units 222 and 224 according to the enhancement algorithms in each unit. The resulting video streams from enhancement units 222 and 224 are inputted into subtraction units 226 and 228 respectively, wherein the bitstream v1 is subtracted from the resulting video streams from enhancement units 222 and 224 to produce video streams v2 and v3, respectively. Video streams v2 and v3 are inputted into the picture analyzer 230. The input bitstream 201 is also inputted into the picture analyzer 230 via splitter 232. The picture analyzer 230 compares v1, v2 and v3 with the original bitstream and determines how best to enhance the picture. The picture analysis performed by the picture analyzer can be performed in a variety of ways. For example, the picture analyzer 230 could compare v1, v2 and v3 with the original picture and select the video stream (v1, v2 or v3) which best approximates the original picture. Alternatively, the picture analyzer can use a mix of the different bitstreams using mix parameters (α, β) or enhancement vectors such that the optimum overall picture quality is achieved from a combination of video streams. For example, the picture analyzer can select a vector representing the mix parameters for calculating the mixture of the enhanced base video streams to control the information in the residual bitstream using the selected vector. Furthermore, a bit cost function can also be used in determining the best mix parameters as will be explained below with reference to FIG. 3. It will be understood that other schemes than the ones described can be used in the picture analyzer 230 and the invention is not limited thereto.
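By way of illustration only, one way the picture analyzer 230 might select mix parameters is sketched below on a single row of pixels. The linear form v1 + αv2 + βv3 is the one given for the FIG. 3 embodiment; the function names, the squared error measure, the coarse parameter grid and the sample pixel values are assumptions made for this sketch and are not prescribed by the description.

```python
def mix(v1, v2, v3, alpha, beta):
    """Mixed signal v1 + alpha*v2 + beta*v3, computed pixel by pixel."""
    return [a + alpha * b + beta * c for a, b, c in zip(v1, v2, v3)]


def squared_error(x, y):
    """Assumed quality measure: sum of squared pixel differences."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def select_mix(original, v1, enhanced_a, enhanced_b, steps=(0.0, 0.5, 1.0)):
    """Pick the (alpha, beta) pair from a coarse grid that best matches the original."""
    # Subtraction units: difference between each enhanced stream and v1.
    v2 = [e - b for e, b in zip(enhanced_a, v1)]
    v3 = [e - b for e, b in zip(enhanced_b, v1)]
    candidates = [(a, b) for a in steps for b in steps]
    return min(
        candidates,
        key=lambda ab: squared_error(original, mix(v1, v2, v3, *ab)),
    )


# Hypothetical data: a sharp edge, its blurred upscaled version, and the
# outputs of two hypothetical enhancement units.
original = [0, 0, 100, 100]
v1 = [0, 25, 75, 100]
enhanced_a = [0, 10, 90, 100]
enhanced_b = [-10, 20, 80, 110]
best = select_mix(original, v1, enhanced_a, enhanced_b)
```

In this sketch, selecting a single stream corresponds to the grid points (0, 0), (1, 0) and (0, 1), so the "best stream" and "best mix" strategies share one search.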

There are numerous advantages to using mix parameters in the picture analyzer 230. Firstly, this is a completely expandable system. If there are more functions to contribute to the sharpness of the picture, they can be easily accounted for. The new functions need not be optimized for the system. Secondly, the interdependencies of various functions can be accounted for while deciding on the suitable enhancement vectors. Thirdly, a spatio-temporal consistency model can be incorporated in the picture analyzer 230.

The upscaled output of the upscaling circuit 216 is subtracted from the original input 201 in a subtraction circuit 234 to produce a residual bitstream which is applied to a switch 236. The switch is controlled by the output of the picture analyzer 230. By comparing the input video bitstream 201 with the various enhanced base video streams, the picture analyzer 230 can determine which pixels or groups of pixels (blocks) need to be further enhanced by the enhancement layer 208. For the pixels or groups of pixels (blocks) that are determined to need enhancement by the picture analyzer 230, the picture analyzer 230 outputs a control signal to close switch 236 to let those parts of the residual bitstream through to the enhancement layer encoder 240. The picture analyzer 230 also sends the selected mix parameters and the control signal for the switch to the encoder 240 so that this information is encoded with the resulting residual bitstream from switch 236 and outputted as the enhancement stream 241.
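By way of illustration only, the gating performed by the switch 236 can be sketched per block of pixels. The description does not state how the picture analyzer 230 decides that a block needs enhancement; the mean squared error test, the threshold value, the block size and the function name below are assumptions made for this sketch.

```python
def gate_residual(original, upscaled, enhanced, block=4, threshold=25.0):
    """Return per-block switch flags and the residual blocks that pass the switch."""
    control = []   # True where the switch is closed for a block
    passed = {}    # block index -> residual samples let through
    for start in range(0, len(original), block):
        sl = slice(start, start + block)
        # Assumed criterion: mean squared error of the enhanced block.
        errors = [(o - e) ** 2 for o, e in zip(original[sl], enhanced[sl])]
        closed = sum(errors) / len(errors) > threshold
        control.append(closed)
        if closed:
            # Residual as in FIG. 2: original input minus upscaled base.
            passed[start // block] = [
                o - u for o, u in zip(original[sl], upscaled[sl])
            ]
    return control, passed


# Hypothetical row of two blocks: a flat area and fine detail (e.g. small text).
original = [10, 10, 10, 10, 0, 100, 0, 100]
upscaled = [10, 10, 10, 10, 50, 50, 50, 50]
enhanced = [10, 11, 9, 10, 40, 60, 40, 60]
control, passed = gate_residual(original, upscaled, enhanced)
```

Only the second block, where enhancement alone leaves a large error, contributes residual data in this example.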

The base stream 215 is sent to a base decoder 250 and the enhancement stream 241 is sent to an enhancement decoder 252 in the decoder section 204. The decoder 250 decodes the base stream 215 which is then upscaled by an upscaling circuit 254. The upscaled decoded bitstream is then split by a splitter 256 and sent to enhancement units 262 and 264, merge unit 270 and addition unit 272. Enhancement unit 262 comprises the same spatial enhancement algorithm as enhancement unit 222 and enhancement unit 264 comprises the same spatial enhancement algorithm as enhancement unit 224. The enhancement units 262 and 264 perform their respective algorithms and send outputs v2 and v3 to the merge unit 270.

The enhancement decoder 252 decodes the enhancement stream and outputs the residual bitstream to the addition unit 272. In addition, the decoder 252 decodes the mix parameters and control signal and sends this information to the merge unit 270. The merge unit merges together all of the inputs to create the enhancement output from the picture analyzer 230. The upscaled decoded base stream and the decoded residual bitstream are combined together by the addition unit 272 and the resulting bitstream is applied to the switch 274. The switch 274 is controlled by the control signal so that the output of the merge unit 270 can be applied to the appropriate pixels or blocks in the bitstream outputted by the addition unit 272 so as to produce the output signal 276.
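By way of illustration only, one possible reading of the switch 274 is sketched below: for blocks whose residual was transmitted, the output of the addition unit 272 is used, and for all other blocks the output of the merge unit 270 is used. This per-block selection rule, the function name, the block size and the sample values are assumptions made for this sketch; the description does not fix them.

```python
def decode_output(upscaled, merged, control, residual_blocks, block=4):
    """Assemble the output row block by block under the assumed selection rule."""
    out = []
    for index, closed in enumerate(control):
        sl = slice(index * block, (index + 1) * block)
        if closed:
            # Addition unit: upscaled base plus decoded residual.
            out.extend(
                u + r for u, r in zip(upscaled[sl], residual_blocks[index])
            )
        else:
            # Merge unit: enhanced base reproduced from the mix parameters.
            out.extend(merged[sl])
    return out


# Hypothetical decoded data for a row of two blocks.
upscaled = [10, 10, 10, 10, 50, 50, 50, 50]
merged = [10, 11, 9, 10, 40, 60, 40, 60]
control = [False, True]
residual_blocks = {1: [-50, 50, -50, 50]}
output = decode_output(upscaled, merged, control, residual_blocks)
```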

FIG. 3 is a block diagram of an encoder/decoder 300 according to another embodiment of the invention. Many of the components in FIG. 3 are the same as the components illustrated in FIG. 2 so they have been given the same reference numerals. In addition, for the sake of brevity, the operations of the similar components will not be described. In this embodiment, a cost function is used in determining the mix parameters α, β, . . . for the individual enhancement signals v2, v3, . . . . According to one embodiment of the invention, enhancement vectors are assigned on a block by block basis. Previously determined best enhancement vectors from a spatio-temporal neighborhood are evaluated in a cost function as illustrated in FIG. 4. The cost function calculates a metric that is related to the objective picture quality. The best estimate of the enhancement vector is defined as the one yielding the smallest cost, i.e., Best vector = min e(αᵢ, βᵢ, . . . ), where i = 1, 2, . . . , number of candidates and e( ) is the cost function with the vectors αᵢ, βᵢ, . . . as parameters.

The cost function should incorporate within itself all the factors that define good quality and also an artifact prevention mechanism. For example, in the case of a sharpness enhancement function, the steepness of the gradients is an important factor and should be accounted for in the cost function. Artifacts like aliasing that result from sharpness improvement should also be included in the cost function. The cost function serves as a quality measure.

Returning to FIG. 3, the enhancement layer encoder 240 sends bitcost information to the picture analyzer 230. The cost function is calculated from the mixed signal Ve(α, β, . . . ) for a limited set of parameters α, β, . . . . The better the picture quality of the signal Ve(α, β, . . . ), the lower the cost function becomes. For every pixel or group of pixels a few vectors can be tested. The test vector with the lowest cost function is then selected. In one embodiment, some of the test vectors are already selected vectors of groups of pixels neighboring in time and space (previous frame). For example, vectors λ₁β₁, λ₂β₂, λ₃β₃ illustrated in FIG. 4 are neighbors of the group of pixels being tested. In addition one or more vectors can be selected with a random offset. The picture analyzer outputs Ve=v1+αv2+βv3 and Vb=v1+λv2+μv3 where α, β are the mix parameters and λ, μ are the cost function parameters. In this embodiment, the signal Vb is subtracted from the original input bitstream in the subtractor 234 to form the residual bitstream. Whenever the final cost function exceeds a predetermined threshold limit, the picture analyzer outputs a signal S to the switch 236 so that the switch will close and for that group of pixels a residual bitstream is encoded in the encoder 240. In addition, the picture analyzer also sends the control signal, the mix parameters and the cost function to the encoder 240 which are then coded and inserted into the enhancement stream 241. When the enhancement stream is decoded in the enhancement decoder 252, the mix parameters and cost function are decoded and sent to the merge unit 270. The merge unit outputs Vb which is added to the decoded enhancement stream in the addition unit 272 and the resulting bitstream is applied to the switch 274. The switch 274 is controlled by the control signal S so that Ve from the merge unit 270 can be applied to the appropriate pixels or blocks in the bitstream outputted by the addition unit 272 so as to produce the output signal 276.
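By way of illustration only, the candidate test described above can be sketched for one group of pixels. The mean squared error used as the cost, the size of the random offset, the threshold value, the function names and the sample data are assumptions made for this sketch; in particular, the bit cost and artifact terms that the cost function may include are left out.

```python
import random


def mix(v1, v2, v3, alpha, beta):
    """Mixed signal Ve = v1 + alpha*v2 + beta*v3, computed pixel by pixel."""
    return [a + alpha * b + beta * c for a, b, c in zip(v1, v2, v3)]


def cost(original, v1, v2, v3, vector):
    """Assumed cost: mean squared error between the original and Ve."""
    ve = mix(v1, v2, v3, *vector)
    return sum((o - e) ** 2 for o, e in zip(original, ve)) / len(original)


def choose_vector(original, v1, v2, v3, neighbours, rng,
                  spread=0.25, threshold=50.0):
    """Test neighbouring vectors plus one randomly offset vector; keep the cheapest."""
    candidates = list(neighbours)
    base_alpha, base_beta = neighbours[0]
    candidates.append((base_alpha + rng.uniform(-spread, spread),
                       base_beta + rng.uniform(-spread, spread)))
    best = min(candidates, key=lambda vec: cost(original, v1, v2, v3, vec))
    best_cost = cost(original, v1, v2, v3, best)
    # The switch closes only when even the best vector is not good enough.
    return best, best_cost, best_cost > threshold


# Hypothetical data for one group of four pixels.
original = [0, 0, 100, 100]
v1 = [0, 25, 75, 100]
v2 = [0, -15, 15, 0]
v3 = [-10, -5, 5, 10]
neighbours = [(1.0, 0.5), (0.5, 0.5), (1.0, 0.0)]
best, best_cost, closed = choose_vector(
    original, v1, v2, v3, neighbours, random.Random(0)
)
```

Reusing vectors already chosen for neighbouring groups keeps the search short, while the random offset lets the selection drift towards better parameters over successive groups.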

The above-described embodiments of the invention enhance the efficiency of spatial scalable compression by using a picture analyzer to select the best or a mix of a plurality of enhanced base bitstreams via determined enhancement vectors to control the information in the encoded residual bitstream. It will be understood that the different embodiments of the invention are not limited to the exact order of the above-described steps as the timing of some steps can be interchanged without affecting the overall operation of the invention. Furthermore, the term “comprising” does not exclude other elements or steps, the terms “a” and “an” do not exclude a plurality and a single processor or other unit may fulfill the functions of several of the units or circuits recited in the claims.

1. A layered encoder for encoding an input video bitstream, the encoder comprising: a layered encoder unit for encoding a base bitstream at a lower resolution and a residual bitstream, the layered encoder unit comprising: a number of enhancement units, each with a different enhancement algorithm for enhancing a decoded upscaled base stream and outputting enhanced base video streams; a picture analyzer for comparing the input video bitstream with the decoded upscaled base bitstream and the enhanced base video streams, where the output of the picture analyzer controls the information included in the residual bitstream.
2. The layered encoder according to claim 1, wherein the picture analyzer selects a vector representing mix parameters for calculating the mixture of the enhanced base video streams and controls the information in the residual bitstream using the selected vector.
3. The layered encoder according to claim 2, wherein the picture analyzer compares the selected mixture of enhanced base video streams with the input video bitstream to determine for which pixels or group of pixels additional enhancement is required via the residual bitstream.
4. The layered encoder according to claim 2, wherein each group of pixels is enhanced using different vectors.
5. The layered encoder according to claim 2, wherein the picture analyzer calculates a cost function for a limited number of test vectors and the test vector with the lowest cost function is selected.
6. The layered encoder according to claim 5, wherein the selected vector is included in a compressed data stream.
7. The layered encoder according to claim 5, wherein a number of the test vectors are already selected vectors of neighboring, in time and space, group of pixels.
8. The layered encoder according to claim 1, wherein the layered encoding unit further comprises: a downsampling unit for reducing the resolution of the input video bitstream; a base encoder for encoding the lower resolution base stream; an upscaling unit for decoding and increasing the resolution of the base stream to produce an upscaled base bitstream; a subtraction unit for subtracting the upscaled base bitstream from the input video bitstream to produce the residual bitstream; switching means for selectively allowing only portions of the residual bitstream to be sent to an enhancement encoder based upon a control signal from the picture analyzer; the enhancement encoder for encoding the portions of the residual bitstream which pass through the switching means to form the encoded residual bitstream.
9. The layered encoder according to claim 8, wherein said switching means is a multiplier having a value between 0 and 1, wherein a value of 0 means the switching means is open and a value of 1 means the switching means is closed.
10. A layered decoder unit for decoding a base bitstream and a residual bitstream, the layered decoder unit comprising: means for enhancing the decoded base bitstream, the means for enhancing comprising a plurality of enhancement units having different enhancement algorithms for outputting an enhanced base video stream, and means for superimposing the decoded residual bitstream on the enhanced base video stream.
11. A layered decoder unit as claimed in claim 10, wherein the decoder is arranged to receive a vector representing mix parameters for calculating the mixture of enhanced base streams produced by the plurality of enhancement units in order to produce the enhanced base video stream.
12. A method for encoding an input video bitstream, the method comprising the steps of: encoding a base bitstream and a residual bitstream, comprising the steps of: enhancing a decoded upscaled base bitstream in a plurality of different enhancement algorithms outputting enhanced base video streams; comparing the input video bitstream with the decoded upscaled base bitstream and the enhanced base video streams, where the output of the comparison controls the information contained in the residual bitstream.
13. The method according to claim 12, wherein a vector representing mix parameters for calculating a mixture of the enhanced base video streams is selected and controls the information in the residual bitstream using the selected vector.
14. The method according to claim 13, wherein the selected mixture of enhanced base video streams is compared with the input video bitstream to determine for which pixels or group of pixels additional enhancement is required via the residual bitstream.
15. The method according to claim 13, wherein each group of pixels is enhanced using different vectors.
16. The method according to claim 13, wherein a cost function for a limited number of test vectors is calculated and the test vector with the lowest cost function is selected.
17. The method according to claim 16, wherein the selected vector is included in a compressed data stream.
18. The method according to claim 16, wherein a number of the test vectors are already selected vectors of neighboring, in time and space, group of pixels.
19. The method according to claim 12, further comprising the steps of: reducing the resolution of the input video bitstream; encoding the lower resolution base stream; decoding and increasing the resolution of the base stream to produce an upscaled base bitstream; subtracting the upscaled base bitstream from the input video bitstream to produce the residual bitstream; selectively allowing only portions of the residual bitstream to be sent to an enhancement encoder based upon a control signal from the picture analyzer; encoding the selectively allowed portions of the residual bitstream to form the encoded residual bitstream.
20. A method of decoding a base bitstream and a residual bitstream, the decoding comprising: enhancing the decoded base bitstream in a plurality of different enhancement algorithms for outputting an enhanced base video stream, and superimposing the decoded residual bitstream on the enhanced base video stream.
21. A method of decoding as claimed in claim 20, wherein the method further comprises receiving a vector representing mix parameters for calculating the mixture of enhanced base streams produced by the plurality of enhancement units in order to produce the enhanced base video stream.
22. A compressed data stream including: a base bitstream and a residual bitstream, wherein the information included in the residual bitstream represents a difference between a bitstream at higher resolution than the base bitstream and an enhanced decoded upscaled base bitstream, which enhanced decoded upscaled base bitstream is based on the base bitstream and wherein the enhancement has been performed by a mixture of a plurality of enhancement algorithms.
23. A compressed data stream as claimed in claim 22, wherein the compressed data stream includes a vector representing mix parameters for calculating the mixture.
24. A storage medium on which a compressed data stream as claimed in claim 22 has been stored.
25. A layered encoder/decoder for encoding and decoding an input video bitstream, comprising: a layered encoder unit for encoding a base bitstream at a lower resolution and a residual bitstream, the layered encoder unit comprising: a number of enhancement units, each with a different enhancement algorithm for enhancing a decoded upscaled base stream and outputting enhanced base video streams; a picture analyzer for comparing the input video bitstream with the decoded upscaled base bitstream and the enhanced base video streams, where the output of the picture analyzer controls the information contained in the residual bitstream; a layered decoder unit for decoding the base bitstream and the residual bitstream, the layered decoder unit comprising: means for performing the same enhancement to the decoded base bitstream as was performed in the encoder unit; and means for superimposing the decoded residual bitstream on the decoded and enhanced video base stream to produce a video output stream.